Newest 'scikit-learn class-imbalance' Questions

2 votes

1 answer

69 views

Taking into account instance cost in learning?

I am generally trying to take costs into account in learning. The set-up is as follows: a statistical learning problem with the usual X and y, where y is imbalanced (roughly 1% of ones). Scikit learn ...

Lucas Morin

2,614

asked Dec 27, 2024 at 9:47

0 votes

1 answer

136 views

Imbalanced Cost-Sensitive Learning Workflow - How to split the data, tune hyperparameters and apply a decision threshold?

I am facing a problem with an imbalanced dataset in which I would like to detect the rare event. My questions are more about general strategy for the whole workflow and I would like to hear your thoughts ...

GeorgeM

3

asked Jul 21, 2024 at 9:44

-1 votes

1 answer

61 views

How to deal with a heavily imbalanced test dataset?

Both my train data and test data were imbalanced, so I tried SMOTE for training. Before SMOTE: ...

GrGr11

1

asked Mar 1, 2024 at 20:03

4 votes

2 answers

2k views

Flipping the labels in a binary classification gives different model and results

I have an imbalanced dataset and I want to train a binary classifier to model the dataset. Here was my approach, which resulted in (relatively) acceptable performance: 1- I made a random split to get ...

Farzad

43

asked Nov 3, 2022 at 14:38

1 vote

0 answers

1k views

Downsampling in sklearn. Test and Train performance question

I have a class-imbalanced data set, and have the following setup to handle class imbalance. I first split into test and train, perform downsampling only on the training set, and then get the test ...

bananaboy

131

asked Oct 27, 2022 at 16:59

1 vote

2 answers

641 views

Evaluation Metric for Imbalanced and Ordinal Classification

I'm looking for an ML evaluation metric that would work well with imbalanced and ordinal multiclass datasets: Imagine you want to predict the severity of a disease that has 4 grades of severity where ...

Fabio Magarelli

111

asked Feb 3, 2022 at 12:08

2 votes

1 answer

2k views

Imbalanced data set with Sample weighting - How to interpret the performance metrics?

Consider a binary classification scenario in which the True class (5%) is severely outnumbered by the False class (95%). My data set contains numeric data. I am using SKLearn and trying some different ...

Jurgen Cuschieri

129

asked Jan 14, 2022 at 1:02

0 votes

1 answer

1k views

roc_auc_score from scikit-learn gives an error when the test label vector contains only a subset of the classes

I have an imbalanced dataset. Does it make sense to compute the roc-auc for the classifier I created in a holdout set? Here's a very artificial MWE: ...

An old man in the sea.

173

asked Nov 22, 2021 at 17:25
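One possible cause of the error in the question above is a holdout split that happens to miss a class. The following is a minimal sketch, not the asker's MWE: it assumes a binary problem with synthetic data (the arrays, the 5% positive rate and the shift applied to the positives are all illustrative) and uses a stratified split so that the holdout keeps both classes before `roc_auc_score` is called.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, illustrative data: 400 rows, 20 positives (5%).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = np.zeros(400, dtype=int)
y[:20] = 1
X[y == 1] += 1.5  # shift the positives so there is some signal to learn

# stratify=y keeps the class proportions in both splits, so the
# holdout set contains both classes and ROC AUC is well defined.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = LogisticRegression().fit(X_train, y_train)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"holdout classes: {np.unique(y_test)}, ROC AUC: {auc:.3f}")
```

With very few positives in the holdout, the resulting AUC can still be noisy even though it is computable.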

1 vote

1 answer

251 views

Imbalanced classification task – Discrepancy between learning curves and test set evaluation

I have a binary classification task related to customer churn for a bank. The dataset contains 10,000 instances and 11 features. The target variable is imbalanced (80% remained as customers (0), 20% ...

KK_o7

67

asked Nov 21, 2021 at 12:48

2 votes

1 answer

367 views

Training is not stable with extreme class imbalance

I'm dealing with a multi-class classification problem with around 30 categories. This problem has a severe class imbalance: Around 300 examples for the least common class. Around 100k examples for ...

David Masip

6,116

asked May 4, 2021 at 10:54

0 votes

1 answer

707 views

Logistic regression with unbalanced data, scoring based only on rare class

I have a dataset of approx. 600,000 data points in which 0.2% (1,200 samples) is labelled as signifying a rare event. I want to use logistic regression to help me predict this rare event, but even when ...

Nick W

15

asked Apr 21, 2021 at 8:24
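For the rare-event logistic regression question above, one commonly used combination is class reweighting plus a metric that focuses on the positive class. The sketch below is only an illustration under stated assumptions: the data come from `make_classification` with about 2% positives (not the asker's 0.2%), and `class_weight="balanced"`, average precision and recall are choices made here, not something the question confirms.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for a rare-event problem (about 2% positives).
X, y = make_classification(
    n_samples=5000, n_features=10, weights=[0.98, 0.02], random_state=0
)

# class_weight="balanced" reweights samples inversely to class frequency.
model = LogisticRegression(class_weight="balanced", max_iter=1000)

# Stratified folds keep some positives in every validation fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Both scorers look only at how well the rare (positive) class is handled.
ap_scores = cross_val_score(model, X, y, cv=cv, scoring="average_precision")
recall_scores = cross_val_score(model, X, y, cv=cv, scoring="recall")

print("average precision per fold:", ap_scores.round(3))
print("recall on the rare class per fold:", recall_scores.round(3))
```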

0 votes

1 answer

28 views

Unbalanced training set from balanced data

I am looking to get an unbalanced training set with a given ratio of classA:classB from a dataset, regardless of whether it is balanced or not. The point is to analyze the influence of data imbalance on ...

jelczyn

1

asked Apr 9, 2021 at 9:22
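Drawing a training set with a chosen classA:classB ratio, as the question above describes, can be sketched with plain NumPy index sampling. The helper name `subsample_to_ratio`, its parameters and the 9:1 example are hypothetical and introduced here only for illustration.

```python
import numpy as np

def subsample_to_ratio(y, n_a, n_b, class_a=0, class_b=1, seed=0):
    """Return shuffled row indices holding exactly n_a samples of
    class_a and n_b samples of class_b, drawn without replacement."""
    rng = np.random.default_rng(seed)
    idx_a = np.flatnonzero(y == class_a)
    idx_b = np.flatnonzero(y == class_b)
    if n_a > idx_a.size or n_b > idx_b.size:
        raise ValueError("not enough samples of one class for the requested counts")
    chosen = np.concatenate([
        rng.choice(idx_a, size=n_a, replace=False),
        rng.choice(idx_b, size=n_b, replace=False),
    ])
    rng.shuffle(chosen)  # avoid all class_a rows coming first
    return chosen

# Balanced toy labels: 500 of each class.
y = np.array([0] * 500 + [1] * 500)

# Request a 9:1 ratio (450 of class 0, 50 of class 1).
train_idx = subsample_to_ratio(y, n_a=450, n_b=50)
print(np.bincount(y[train_idx]))
```

The same indices can be applied to the feature matrix (`X[train_idx]`), and the remaining rows can serve as a held-out set.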

0 votes

1 answer

2k views

How does class_weight work in Decision Tree?

I am interested in Cost-Sensitive learning. And I am trying to understand how class_weight in DecisionTree works in terms of math. I read a lot of articles that ...

Marni

21

asked Mar 21, 2021 at 10:06
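As a rough illustration of the math behind the `class_weight` question above: in scikit-learn's decision trees, class weights act as per-sample weights, so the class proportions used in the impurity are weighted proportions. The sketch below assumes the default Gini criterion and toy data (90 negatives, 10 positives, weights 1 and 9); `weighted_gini` is a hypothetical helper written for this example, and it is compared against the root impurity stored on the fitted tree.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data: 90 samples of class 0, 10 samples of class 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.array([0] * 90 + [1] * 10)
class_weight = {0: 1.0, 1: 9.0}  # illustrative weights

def weighted_gini(y, class_weight):
    """Gini impurity where each sample counts with its class weight."""
    weights = np.array([class_weight[label] for label in y])
    totals = np.array([weights[y == k].sum() for k in np.unique(y)])
    p = totals / totals.sum()  # weighted class proportions
    return 1.0 - np.sum(p ** 2)

tree = DecisionTreeClassifier(
    class_weight=class_weight, max_depth=1, random_state=0
).fit(X, y)

manual_root = weighted_gini(y, class_weight)
sklearn_root = tree.tree_.impurity[0]  # impurity at the root node

# Weighted counts are 90 vs 90, so the root looks perfectly balanced.
print(manual_root, sklearn_root)
```

Without weights the same root would have proportions 0.9 and 0.1, giving a Gini impurity of 1 - 0.81 - 0.01 = 0.18, so the weighting changes which splits look attractive.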

0 votes

2 answers

974 views

GridSearch on imbalanced datasets

I'm trying to use grid search to find the best parameter for my model. Knowing that I have to implement the NearMiss undersampling method while doing cross-validation, should I fit my grid search on my ...

Valentin

135

asked Feb 16, 2021 at 7:55
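A general point behind the grid search question above is that resampling should be applied inside each cross-validation training fold, never to the validation fold. The following sketch shows that idea with a hand-written loop. It is an illustration under assumptions: random undersampling is swapped in for NearMiss to keep the example dependent on scikit-learn only, and the data, the grid over `C`, the `undersample` helper and the average-precision metric are all hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import StratifiedKFold

def undersample(X, y, rng):
    """Randomly drop majority-class rows until all classes have equal size."""
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
rng = np.random.default_rng(0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for C in [0.01, 0.1, 1.0, 10.0]:
    fold_scores = []
    for train_idx, val_idx in cv.split(X, y):
        # Resample the training fold only; the validation fold keeps
        # its original (imbalanced) class distribution.
        X_res, y_res = undersample(X[train_idx], y[train_idx], rng)
        clf = LogisticRegression(C=C, max_iter=1000).fit(X_res, y_res)
        val_scores = clf.predict_proba(X[val_idx])[:, 1]
        fold_scores.append(average_precision_score(y[val_idx], val_scores))
    results[C] = float(np.mean(fold_scores))

best_C = max(results, key=results.get)
print(results, "best C:", best_C)
```

If the imbalanced-learn package is available, its `Pipeline` combined with `NearMiss` and scikit-learn's `GridSearchCV` is one way to get the same fold-wise behavior without a manual loop.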
